Skip to content

Kokkos support for Lagrange and Monomial elements #4441

Open
rochi00 wants to merge 9 commits intolibMesh:develfrom
rochi00:kokkos-fe-refactor
Open

Kokkos support for Lagrange and Monomial elements #4441
rochi00 wants to merge 9 commits intolibMesh:develfrom
rochi00:kokkos-fe-refactor

Conversation

@rochi00
Copy link
Copy Markdown

@rochi00 rochi00 commented Apr 22, 2026

  1. Shape functions — nativeShape, nativeGradShape for LAGRANGE (12 types) + MONOMIAL (orders 0-5)
  2. Type dispatch — FEShapeKey, classFromTopology, nDofs, templated nativeMapShape
  3. Quadrature — GaussQuadrature device-callable for EDGE, QUAD, HEX, TRI, TET + fillQuadrature host helper
  4. Physical map — physicalPoint, jacobian, physicalPointAndJacobian, faceJacobian, faceJxW, faceNormal (templated + runtime versions)
  5. Device-safe asserts — printf + Kokkos::abort() for debug builds on GPU

rochi00 and others added 7 commits April 27, 2026 13:46
- Add include/libmesh/kokkos/ with device-callable FE math headers:
  scalar_types.h, fe_types.h, fe_base.h, fe_evaluator.h,
  fe_lagrange_{1,2,3}d.h, fe_monomial.h, fe_face_map.h
- Add src/kokkos/fe_types.C with non-inline FEElemTopology/FEFamily
  conversion functions and nDofs overloads
- Add --with-kokkos=DIR configure option (m4/libmesh_optional_packages.m4)
  that defines LIBMESH_HAVE_KOKKOS and LIBMESH_ENABLE_KOKKOS conditional
- Conditionally compile src/kokkos/fe_types.C into libmesh (Makefile.am)
- Register HAVE_KOKKOS in include/libmesh_config.h.in template

All FE math types live in namespace libMesh::Kokkos.
- Remove tests for deleted functions: toKokkosTopology, toKokkosFamily,
  nDofs(FEFamily, FEElemTopology), and FEElemTopology enum contiguity
- Update FEElemTopology::* -> libMesh::ElemType values throughout
- Update FEFamily::* -> libMesh::FEFamily values in FEShapeKey construction
- Update FEElemTopology topo field -> libMesh::ElemType topo in test struct
- Replace nativeShape(ElemType,...) -> nativeMapShape(LAGRANGE_MAP,ElemType,...)
  since the ElemType overload was removed; nativeShape now takes FEShapeKey
- Replace nativeGradShape(ElemType,...) -> nativeGradMapShape(LAGRANGE_MAP,...)
The PETSc-bundled Kokkos has broken half_impl_t support with certain
CUDA toolkit versions. Define KOKKOS_HALF_T_IS_FLOAT before including
Kokkos_Core.hpp to bypass the incompatible half-precision code paths.
The tests do not use half-precision types.
GaussLegendre1D: 1-D Gauss-Legendre rules (1-7 points), device-callable.
GaussQuadrature: Full topology dispatcher with n_points/point/weight
methods for EDGE, QUAD, HEX (tensor product), TRI (Dunavant), and
TET (Keast/Walkington). All methods are LIBMESH_DEVICE_INLINE.
fillQuadrature: Host-side convenience wrapper using std::vector.
@rochi00 rochi00 force-pushed the kokkos-fe-refactor branch from b160299 to 1dc4dd1 Compare April 27, 2026 19:49
@rochi00 rochi00 marked this pull request as ready for review April 27, 2026 22:12
@moosebuild
Copy link
Copy Markdown

Job Coverage, step Generate coverage on 1dc4dd1 wanted to post the following:

Coverage

10e93d #4441 1dc4dd
Total Total +/- New
Rate 65.47% 65.47% +0.00% -
Hits 78228 78231 +3 0
Misses 41256 41253 -3 0

Diff coverage report

Full coverage report

This comment will be updated on new commits.

Comment thread include/base/libmesh_device.h Outdated
// License along with this library; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

#pragma once
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In practice I think literally every compiler we care about supports this (though Wiki lists one I've never heard of that doesn't!), but since it never made it into a C++ standard we've always used include guards instead; let's do that here too.

Comment thread include/base/libmesh_exceptions.h Outdated
#define libmesh_noexcept noexcept

#if defined(__CUDA_ARCH__) || defined(__HIP_DEVICE_COMPILE__)
// GPU device code does not support C++ exceptions — silently no-op.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly when we throw an exception it's because we can't sanely go on; even if we don't support exceptions on device I don't think silently going on is the right way to handle that. When we don't support exceptions on the host we print the error message and abort; should we not do that here too?

namespace libMesh {

/**
* \enum libMesh::FEElemClass groups element types by topological class,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept is good, but I don't like the name. FEFamily really is FE Method specific; an Elem is an Elem even if we're doing Finite Volume or pure geometry/meshing work (or in the future FDM stencils or whatever) instead. Let's just call this ElemClass, unless someone has an idea for a better name?

Comment thread include/gpu/kokkos_fe_base.h Outdated
//
// Order is only meaningful for MONOMIAL specializations.
// Lagrange specializations always use Order = 0 (the default) because the
// ElemType (e.g. QUAD9) already encodes the polynomial order.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it doesn't. p=1 Lagrange on a Quad9 makes perfect sense and is commonly used for the pressure variable in incompressible flow simulation.

I get that we're not supporting everything right out of the gate, but we need to make sure that we're not digging ourselves into any holes in our documentation or (more importantly, since it'll be much harder to change later) our APIs. Is there something in the design that would already prevent us from returning evaluations of a p=1 Lagrange on a higher-order geometric element?

Comment thread include/gpu/kokkos_fe_evaluator.h Outdated
// ── On-device helpers: element class -> spatial dimension ─────────────────────

LIBMESH_DEVICE_INLINE unsigned int
dimFromClass(FEElemClass cls)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to use snake_case rather than camelCase, like we do everywhere else in libMesh code.

We should also try to get this to resemble existing code in less superficial ways if possible. We've got static const unsigned int Elem::type_to_dim_map[] already, for instance; could we either add an Elem::class_to_dim_map[] too, or if elem.h would cause Kokkos fits, would it make sense for us to still use the same array-based design to make things easier to merge later?

Comment thread include/gpu/kokkos_scalar_types.h Outdated

// ---------------------------------------------------------------------------
// Free operators not covered by TypeVector/TypeTensor arithmetic
// (scalar±vector forms that don't exist in the base classes)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They don't, because that's kind of horrifying for type safety. Even if they did we'd probably decide that the proper projection of s from ℝ→ℝ³ was (s,0,0), not (s,s,s). That's not even of magnitude s!

Where do we use these? If the answer is "nowhere" let's delete them; if not let's give them new names so I can grep for those names and stare at the hits suspiciously.

parent_pt(0) = pt(0);
parent_pt(1) = pt(1);
parent_pt(2) = pt(2);
return parent_pt;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what?

There are reference coordinates on a point element; they're all just (0,0,0).

But (x,y,z) is what point(0) will give you, and that is not (xi,eta,zeta). point(0) on a NodeElem will be a point in physical space.

Comment on lines +53 to +63
for (unsigned int k = 0; k < n; ++k)
{
const Real s = face_qpt(0);
const Real t = face_qpt(1);
const Real psi = nativeMapShape(mapping_type, side_topo, k, s, t, 0.0);

const auto & pt = side_in_parent.point(k);
parent_pt(0) += psi * pt(0);
parent_pt(1) += psi * pt(1);
parent_pt(2) += psi * pt(2);
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, it's midnight, and I'm getting tired of tearing my hair out, and I think I've already written enough to keep you going for a while, so I'm going to post this review half done, with one final issue:

Is this whole block not just wrong? It gives the location in physical space of face_qpt in side-reference-space, but the docs here say we're returning a point in parent-reference space. There has to be an inverse_map() afterward to get back to that.

Please tell me I'm wrong, and I just lose too many IQ points by midnight? We clearly need way more test coverage if I'm somehow our first line of defense for a bug this big. I know we don't have the environments set up to test in PR yet, but ideally you're running make check yourself and at some point in the new tests we're doing a side integral?

Comment thread .gitmodules
[submodule "contrib/metaphysicl"]
path = contrib/metaphysicl
url = ../../libMesh/MetaPhysicL
url = ../../rochi00/MetaPhysicL
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No. Definitely not.

Comment thread Makefile.am
Comment on lines +263 to +264
if LIBMESH_ENABLE_KOKKOS
endif
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, one more comment.

There's an interesting story about the rock band Van Halen:

When they were on tour, they had a fairly complicated and safety-critical set of contract stipulations that venues had to meet: heavy platforms and heavy equipment, serious electrical requirements, pyrotechnics, etc. And they had a little contract clause that demanded a bowl of M&Ms for them backstage, with all the brown M&Ms removed.

If someone challenged that clause, they waived it. They didn't care about not eating brown M&Ms. They just cared that the big long critical pile of contract text had actually been respected, entirely read by the people with the primary responsibility of reading it and making sure everything was set up correctly. If they hadn't been reading carefully enough to notice the brown-M&M-hating-divas bit, could they really be trusted to be getting the subtle, tricky mechanical and electrical and thermal stuff done and inspected safely? No; the band roadies would essentially have to reinspect everything to see which parts of it they'd have to redo.

It was not a good idea to be a venue that made the hard rock band and their annoyed roadies unable to trust any of your work.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I mostly paid attention to the tests that were written, and once they all passed assumed the internal code was correct. That was clearly not a good idea. I will review everything now. The tests clearly didn't have enough coverage

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The claude generated comments had so many issues, it is terrifying how much it overclaims

rochi00 added 2 commits April 28, 2026 13:57
…evice-safe abort

- Replace #pragma once with traditional include guards in all gpu/ headers
- Rename all camelCase functions to snake_case (libMesh convention)
- Refactor FEShapeKey to carry ElemType instead of FEElemClass
- Add lagrange_shape_topology_for_key() matching libMesh dispatch pattern
- LIBMESH_THROW on device: print + abort instead of silent no-op
- Remove lossy FEElemClass indirection from dispatch path
- Simplify Real3 to alias of RealVectorValue
- Add comprehensive physics shape tests for exact Lagrange keys
Deduplicate Gauss quadrature point/weight tables between the CPU
quadrature sources (quadrature_gauss_{1,2,3}D.C) and the GPU
kokkos_quadrature.h header by extracting them into a shared
device-callable header. Also extends kokkos_fe_types.h with
additional n_dofs support and refines face map implementation.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants